Abstract

Horizontal gene transfer (HGT) of DNA from the plastid to the nuclear and mitochondrial genomes of higher plants is a common 
phenomenon; however, plastid genomes (plastomes) are highly conserved and have generally been regarded as impervious to HGT. 
We sequenced the 158 kb plastome and the 690 kb mitochondrial genome of common milkweed (Asclepias syriaca [Apocynaceae])
and found evidence of intracellular HGT for a 2.4-kb segment of mitochondrial DNA to the rps2-rpoC2 intergenic spacer of the 
plastome. The transferred region contains an rpl2 pseudogene and is flanked by plastid sequence in the mitochondrial genome, 
including an rpoC2 pseudogene, which likely provided the mechanism for HGT back to the plastome through double-strand break 
repair involving homologous recombination. The plastome insertion is restricted to tribe Asclepiadeae of subfamily Asclepiadoideae, 
whereas the mitochondrial rpoC2 pseudogene is present throughout the subfamily, which confirms that the plastid to mitochondrial 
HGT event preceded the HGT to the plastome. Although the plastome insertion has been maintained in all lineages of 
Asclepiadoideae, it shows minimal evidence of transcription in A. syriaca and is likely nonfunctional. Furthermore, we found 
recent gene conversion of the mitochondrial rpoC2 pseudogene in Asclepias by the plastid gene, which reflects continued interaction
of these genomes. 
Introduction

Horizontal gene transfer (HGT) is the phenomenon in which genetic material is
transmitted laterally between organisms or between genomes within organisms,
rather than vertically through sexual reproduction (Keeling and Palmer 2008;
Bock 2010; Renner and Bellot 2012). This process can occur irrespective of the
relatedness of the organisms and occurs frequently between prokaryotes and
eukaryotes (Keeling and Palmer 2008; Bock 2010). HGT between genomes in plants
takes place through intracellular transfer of DNA among the nuclear, plastid,
and mitochondrial genomes. In land plants, HGT of plastid DNA (ptDNA) to the
mitochondrial genome and transfer of ptDNA and mitochondrial DNA (mtDNA) to the
nuclear genome are well documented (Richardson and Palmer 2007; Keeling and
Palmer 2008; Bock 2010; Smith 2011; Renner and Bellot 2012). However, the
highly conserved land plant plastome was long considered "essentially immune"
to HGT (Richardson and Palmer 2007, p. 7; Smith 2011) until the recent reports
of a single case of movement of mtDNA into the plastome (Goremykin et al. 2009;
Iorizzo et al. 2012a, 2012b).

In the single confirmed case of mitochondrial to plastid HGT, mtDNA including a
putative mitochondrial cox1 pseudogene was discovered in the cultivated carrot
(Daucus carota) plastome by Blast analyses during the characterization of the
grape mitochondrial genome (Goremykin et al. 2009) and confirmed by sequencing
of the carrot mitochondrial genome (Iorizzo et al. 2012b). Iorizzo et al.
(2012a) proposed transposition via a non-long terminal repeat (LTR)
retrotransposon as the mechanism of HGT and, by surveying the plastomes of
other members of Apiaceae, found that six additional species of Daucus and its
close relative, cumin (Cuminum cyminum), share the DNA transfer from the
mitochondrial to the plastid genome.

A putative mitochondrial to plastid HGT of an rpl2 pseudogene was detected by
Ku et al. (2013) in the common milkweed (Asclepias syriaca L.; Apocynaceae)
based on Blast similarity searches. Here, we confirm transfer of mtDNA,
including the rpl2 pseudogene (ψrpl2), into the plastome of A. syriaca based on
the full genome sequences of the plastome and mitochondrial genome for this
species. Furthermore, we suggest that homologous recombination mechanisms that
function to repair double-strand breaks in plastomes enabled the movement of
mtDNA into a highly conserved plastome, a process that was facilitated by the
presence of plastid sequence in the mitochondrial genome from earlier HGT
events. We place this gene transfer event in an evolutionary context by showing
the phylogenetic distribution of the insertion across subfamily
Asclepiadoideae, determine that it occurred in the common ancestor of tribe
Asclepiadeae, and demonstrate that the exogenous ptDNA has been retained
relatively intact in all daughter lineages. Through additional phylogenetic
analyses of homologous sequences in the plastid and mitochondrial genomes, we
determined that there has been ongoing exchange between the two genomes leading
to gene conversion of a mitochondrial pseudogene (ψrpoC2) in Asclepias.
Finally, by means of transcriptome sequencing for A. syriaca, we show that only
a small portion of the horizontally transferred segment, not including ψrpl2,
is transcribed, leading us to suggest that most of the segment lacks function,
although the expressed ends may have some regulatory role.

Materials and Methods 

Illumina Library Preparation and Sequencing

We extracted DNA using one of the following kits or methods: FASTDNA kit (MP
Bio), DNeasy kit (Qiagen), or the cetyl trimethylammonium bromide method (Doyle
and Doyle 1987). We prepared Illumina libraries for A. syriaca and 11 other
Asclepiadoideae (table 1) following Straub et al. (2011) or using the gel-free
size selection protocol described in Straub et al. (2012) for Asclepias nivea
and Secamone afzelii. We obtained mate pair libraries for A. syriaca with
insert sizes of 2, 3.6, 4.5, and 5 kb from Global Biologics (Columbia, MO). We
sequenced either single-end or paired-end Illumina short reads (36, 80, or
101 bp) for each library on Illumina HiSeq 2000, Illumina GAIIx, or Illumina
MiSeq sequencers at either the Oregon State University Center for Genome
Research and Biocomputing (OSU-CGRB) or the Oregon Health and Science
University.

Sequence Assembly and Analysis 

We used Alignreads v. 2.25 (Straub et al. 2011) and the A. syriaca reference
plastome we previously sequenced (GenBank: JF433943.1), with one copy of the
inverted repeat removed, for reference-guided plastome assembly of a random
subset of 5 million reads from the A. syriaca data set. We conducted Blast
(Altschul et al. 1997) searches (BlastN) against the GenBank nucleotide
database to identify regions of similarity to plant mitochondrial genomes in
the insert region. We used Velvet 1.0.12 (Zerbino and Birney 2008) and Velvet
Optimiser (http://bioinformatics.net.au/software.velvetoptimiser.shtml, last
accessed July 4, 2013) with a hash length of 71, expected coverage of 31, and
coverage cutoff of 0.326 to perform de novo assembly of the mitochondrial
genome of A. syriaca. We trimmed the A. syriaca mate pair reads to 30 bp (2,
3.6, and 4.5 kb libraries; 101 bp original length) or removed bases below Q30
on either end of the reads and retained reads >30 bp (5 kb library; 36 bp
original length) using Trimmomatic v. 0.20 (Lohse et al. 2012). We used the
trimmed mate pair reads in SSPACE 2.0 (Boetzer et al. 2011) to scaffold the
mitochondrial de novo contigs. We used Geneious 6.1.4 (Biomatters Ltd.) to
design polymerase chain reaction (PCR) primers (supplementary table S1,
Supplementary Material online) to fill gaps between scaffolds using standard
Sanger dye-termination sequencing (Applied Biosystems). We further used the PCR
results to order scaffolds into a master circle. Undetermined sequence in the
Velvet assembly was ascertained either by read mapping in Geneious or by
reference-guided assembly in Alignreads.
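
The end-trimming rule described above for the 5 kb mate pair library (remove bases below Q30 from either end, keep reads >30 bp) can be sketched in a few lines of Python. This is only an illustration of the logic, assuming Phred+33 quality strings; the function name `trim_ends` and its defaults are hypothetical, and it is not Trimmomatic's implementation.

```python
def trim_ends(seq, qual, min_q=30, min_len=31, offset=33):
    """Trim low-quality bases from both ends of a read.

    Assumptions (not from the source): qualities are Phred+33 ASCII,
    and min_len=31 encodes the rule "retain reads >30 bp".
    Returns (trimmed_seq, trimmed_qual), or None if the read is too short.
    """
    scores = [ord(c) - offset for c in qual]
    start, end = 0, len(scores)
    # Walk inward from the 5' end while quality is below the threshold.
    while start < end and scores[start] < min_q:
        start += 1
    # Walk inward from the 3' end while quality is below the threshold.
    while end > start and scores[end - 1] < min_q:
        end -= 1
    if end - start < min_len:
        return None
    return seq[start:end], qual[start:end]


if __name__ == "__main__":
    # '#' encodes Q2 and 'I' encodes Q40 under Phred+33.
    read = "A" * 36
    quals = "##" + "I" * 32 + "##"
    print(trim_ends(read, quals))
```

Only the ends are trimmed here; a low-quality base in the interior of a read is left in place, which matches the wording "on either end of the reads".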

We identified homologous regions between the plastome and mitochondrial genome
using BLAT (Kent 2002) and annotated these regions using Mitofy (Alverson et
al. 2010). We aligned assembly contigs and consensus sequences with a selection
of asterid plastome sequences available in GenBank (table 2) using MAFFT v.
6.857 (Katoh and Toh 2008) and compared them in BioEdit v. 7.0.5.3 (Hall 1999)
to determine insert sizes (length of nonhomologous regions). We confirmed the
5' and 3' ends of the insert sequence in the A. syriaca plastome by PCR (1×
Phusion® Flash High-Fidelity PCR Master Mix [Finnzymes], 0.2 µM forward primer
5'-ACACTCTCGTAGCGCCGTATAGTCTT-3', 0.2 µM reverse primer
5'-GGTTCAAAGGATTAGTGCACCCTTCA-3'; cycling conditions: 30 s at 98 °C; 25 cycles
of 10 s at 98 °C, 30 s at 61 °C, and 90 s at 72 °C; final extension of 10 min
at 72 °C) followed by standard Sanger dye-termination sequencing.

Table 2
List of Asterid Plastome Sequences Downloaded from NCBI and Aligned to Determine the Size of the Typical Asterid rps2-rpoC2 Intergenic Spacer and Stop Codon

Order        Family          Species                    NCBI Accession Number  Size of rps2-rpoC2 (bp)  Canonical Stop Codon Utilized?
Apiales      Apiaceae        Anthriscus cerefolium      NC_015113.1            220                      Yes
Apiales      Apiaceae        Daucus carota              NC_008325.1            209                      Yes
Apiales      Araliaceae      Panax ginseng              NC_006290.1            212                      Yes
Asterales    Asteraceae      Ageratina adenophora       NC_015621.1            260                      No
Asterales    Asteraceae      Guizotia abyssinica        [illegible]            [illegible]              No
Asterales    Asteraceae      Helianthus annuus          [illegible]            [illegible]              No
Asterales    Asteraceae      Jacobaea vulgaris          [illegible]            [illegible]              No
Asterales    Asteraceae      Lactuca sativa             [illegible]            [illegible]              No
Asterales    Asteraceae      Parthenium argentatum      [illegible]            [illegible]              No
Ericales     Ericaceae       Vaccinium macrocarpon      NC_019616.1            262                      No
Ericales     Primulaceae     Ardisia polysticta         [illegible]            221                      Yes
Ericales     Theaceae        Camellia sinensis          [illegible]            [illegible]              No
Gentianales  Rubiaceae       Coffea arabica             NC_008535.1            213                      Yes
Lamiales     Lamiaceae       Salvia miltiorrhiza        NC_020431.1            208                      Yes
Lamiales     Lamiaceae       Tectona grandis            HF567869.1             208                      Yes
Lamiales     Oleaceae        Jasminum nudiflorum        NC_008407.1            258                      Yes
Lamiales     Oleaceae        Olea europaea              NC_013707.2            208                      Yes
Lamiales     Oleaceae        Olea woodiana              NC_015608.1            208                      Yes
Lamiales     Pedaliaceae     Sesamum indicum            KC569603.1             209                      Yes
Solanales    Convolvulaceae  Ipomoea purpurea           NC_009808.1            213                      Yes
Solanales    Solanaceae      Atropa belladonna          NC_004561.1            221                      Yes
Solanales    Solanaceae      Nicotiana sylvestris       NC_007500.1            227                      Yes
Solanales    Solanaceae      Nicotiana tabacum          NC_001879.2            227                      Yes
Solanales    Solanaceae      Nicotiana tomentosiformis  NC_007602.1            222                      Yes
Solanales    Solanaceae      Nicotiana undulata         NC_016068.1            227                      Yes
Solanales    Solanaceae      Solanum bulbocastanum      NC_007943.1            228                      Yes
Solanales    Solanaceae      Solanum lycopersicum       NC_007898.2            225                      Yes
Solanales    Solanaceae      Solanum tuberosum          NC_008096.2            225                      Yes

We assembled the A. nivea plastome sequence using a combination of de novo and
reference-guided assembly following Straub et al. (2011) and using the
A. syriaca reference (GenBank: JF433943.1). For other Asclepiadoideae, we made
reduced random read pools providing approximately 125× sequencing depth of the
plastome, or used all reads if <125× depth was obtained, and then conducted
reference-guided assembly in Alignreads using the A. nivea sequence as a
reference, the medium similarity setting, and the masking parameters and
evaluation criteria of Straub et al. (2012). We performed a second iteration of
this analysis using the result of the first assembly as the reference in cases
where the whole insert region was not assembled in a single contig during the
first round of assembly.
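
As a rough guide to how large a random read pool must be to reach the approximately 125× plastome depth mentioned above, the expected depth is (number of plastid reads × read length) / plastome length. The sketch below inverts that relationship. The function names and the `plastid_fraction` parameter (the share of reads in a total genomic library that derive from the plastome) are assumptions introduced for illustration; the source does not state how pool sizes were chosen.

```python
import math
import random


def reads_for_depth(target_depth, genome_len_bp, read_len_bp, plastid_fraction=1.0):
    """Number of reads to sample so the plastome reaches target_depth on average.

    plastid_fraction is a hypothetical parameter: the proportion of reads in
    the library expected to come from the plastome (1.0 = plastid reads only).
    """
    return math.ceil(target_depth * genome_len_bp / (read_len_bp * plastid_fraction))


def random_read_pool(reads, n, seed=0):
    """Return a random subset of n reads, or all reads if fewer are available."""
    if len(reads) <= n:
        return list(reads)
    return random.Random(seed).sample(reads, n)


if __name__ == "__main__":
    # Plastome length (158,719 bp) and read length (80 bp) are taken from the text.
    print(reads_for_depth(125, 158_719, 80))
```

Falling back to the full read set when too few reads are available mirrors the rule in the text of using all reads if <125× depth was obtained.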

Sequencing Depth Estimation in A. syriaca 

We removed duplicate sequences from the read pool to eliminate PCR duplicates
that could potentially inflate the observed sequencing depth, using custom
Python scripts for paired-end reads and the FASTX-toolkit
(http://codex.cshl.org/labmembers/gordon/fastx_toolkit/, last accessed July 24,
2013) for single-end reads. We then used the assembled A. syriaca plastome and
mitochondrial genome sequences to map reads in BWA v. 0.5.7 (Li and Durbin
2009). To exclude reads that may have originated in the other genome, we used
only perfect matches to construct read pileups using SAMtools v. 0.1.7 (Li et
al. 2009). We used the pileups to determine the number of reads with starts
mapping in the insert regions. To ensure that only reads from the correct
genome were counted, in regions of 100% sequence similarity we excluded from
the counts the read starts for reads fitting wholly within the window of
similarity. This estimate is conservative because some reads that did originate
from the genome in question will also have been eliminated. We then sampled six
sequence regions of the same size as the inserted segment from the surrounding
areas of the plastome. Each was separated from the insert segment and from
every other sample by the read length (80 bp), so that no read was present in
two windows. We counted read starts in these windows and used them to construct
a 95% confidence interval for the number of reads expected to be found for each
genome per sequence window.
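
The window-based comparison described above can be sketched as follows. The source does not state how the 95% confidence interval was computed; this sketch assumes a Student's t interval on the mean of the six window counts, and the function names are hypothetical.

```python
import math
import statistics

# Two-sided 95% critical value of Student's t for 5 degrees of freedom
# (six windows). Hard-coded so the sketch needs no external libraries.
T_CRIT_DF5 = 2.571


def count_read_starts(read_starts, windows):
    """Count read start positions falling in each half-open window [lo, hi)."""
    return [sum(lo <= pos < hi for pos in read_starts) for lo, hi in windows]


def mean_confidence_interval(counts, t_crit=T_CRIT_DF5):
    """t-based confidence interval for the mean count per window.

    Assumption: the default t_crit is only appropriate for six windows.
    """
    mean = statistics.mean(counts)
    half_width = t_crit * statistics.stdev(counts) / math.sqrt(len(counts))
    return mean - half_width, mean + half_width


if __name__ == "__main__":
    counts = [10, 12, 14, 16, 18, 20]  # made-up counts, for illustration only
    print(mean_confidence_interval(counts))
```

Counting read starts rather than overlapping reads means each read contributes to exactly one window, which is why the windows only need to be separated by one read length.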
PCR Screening for the Insert in Asclepiadoideae Plastomes and the rpoC2 Pseudogene in Asclepiadoideae Mitochondrial Genomes

To screen for the presence of the insert in the rps2-rpoC2 
intergenic spacer in other Asclepiadoideae, we used the 
same PCR primers used in A. syriaca and the same PCR con- 
ditions with the exception of lowering the annealing temper- 
ature to either 55 °C or 50 °C to achieve amplification in some 
species. 

To screen for the presence of an rpoC2 pseudogene in the mitochondrial genomes
of Asclepiadoideae, we designed one PCR primer in the most divergent region
between the plastid rpl2 pseudogene and mitochondrial rpl2
(5'-GCGATTGGTCTAAACCTTCGA-3'). To further increase specificity, we incorporated
a penultimate mismatch (Cha et al. 1992). We designed three reverse primers
using the A. syriaca mitochondrial rpoC2 as a guide (R1:
5'-ACCATTGATCATTTTGTTATCATCCA-3'; R2: 5'-ACATATGAAATAGCGGACGGTCTA-3'; R3:
5'-TGTCAAAGAGGCGAAGAAATCT-3'). We used the PCR conditions given above, modified
by decreasing the annealing temperature to 55 °C and increasing the number of
cycles to 30.

Sequencing and Assembly of rpoC2/ψrpoC2

To informatically assemble ψrpoC2, we used BLAT to identify reads that hit the
plastid rpoC2 sequence of each species for which we collected Illumina data. We
determined the sequence of ψrpoC2 by using the plastid sequence as a reference
and mapping reads with one or more mismatches or those supporting indels versus
the plastid sequence in Geneious using the Geneious assembler with custom
sensitivity settings to allow gaps with a maximum of 50% per read and size of
60 bp, a word length of 24 with words repeated more than eight times ignored, a
maximum of 25% mismatches per read, an index word length of 14, and a maximum
ambiguity of four. We mapped multiple best matches randomly and did not use the
fine tuning option. We detected gaps larger than those recovered by read
mapping with BLAT. We checked portions of the informatically determined
sequence for accuracy using primers R1 and R2 as sequencing primers in Eustegia
and Telosma, and R3 and an additional primer (R4:
5'-TCTAATGGAAAAAGCAAATTGAATGA-3') in A. syriaca, and employed the ψrpoC2 PCR
protocol and standard Sanger dye-termination sequencing described above. We
used the same read mapping approach to detect variants of the rps2-rpoC2
intergenic spacer across Asclepiadoideae that could correspond to the
homologous region of the mitochondrial genome if present in species other than
A. syriaca.

Phylogenetic Analyses 

We prepared a matrix for phylogenetic analysis by aligning plastome sequences
in MAFFT v. 6.864b. Due to the difficulty in assembling putative pseudogenes
accD and ycf1 in Asclepias (Straub et al. 2011), we excised these regions from
the alignment. We used Gblocks v. 0.91b (Talavera and Castresana 2007) with
default settings (minimum of seven and ten sequences for a conserved position
and a flank position, respectively; maximum of eight nonconserved positions;
minimum length of 10 bp in a block), modified by allowing positions where half
of the sequences contained gaps, to remove unalignable and poorly aligned
regions of the matrix. We analyzed this final matrix (supplementary data set
S1, Supplementary Material online) by utilizing CIPRES resources (Miller et al.
2010) to run RAxML v. 7.6.3 (Stamatakis 2006; Stamatakis et al. 2008) with the
following settings: -p 6540444 -x 18884 -N 1000 -k -f a -m GTRCAT. The maximum
likelihood (ML) search and 1,000 rapid bootstrapping replicates were performed
in a single run. The tree was rooted using S. afzelii, a member of subfamily
Secamonoideae, the sister subfamily of Asclepiadoideae, as an outgroup.

We prepared a second matrix for phylogenetic analysis by 
aligning the sequences of rpoC2 from the plastomes of each 
species, the informatically assembled ψrpoC2 sequence from
each species, and an outgroup plastid rpoC2 sequence from 
Nerium oleander (GenBank: GQ997692.1), a member of 
Apocynoideae (another subfamily of Apocynaceae), using 
MAFFT. We conducted a phylogenetic analysis of the final 
matrix (supplementary data set S2, Supplementary Material 
online) using RAxML through CIPRES with the following 
settings: -p 1648144 -x 2465477 -N 5000 -k -f a -m 
GTRGAMMA. The ML search and 5,000 rapid bootstrapping 
replicates were performed in a single run. 
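
For readers who want to reproduce the two RAxML runs outside CIPRES, the settings quoted in this section can be assembled into command lines as sketched below. The flags -p, -x, -N, -k, -f a, and -m are those given in the text; the alignment file names, run names, and the `raxmlHPC` executable name are assumptions, and -s (input alignment) and -n (run name) are added because RAxML requires them on the command line.

```python
def raxml_command(alignment, run_name, parsimony_seed, bootstrap_seed,
                  replicates, model, executable="raxmlHPC"):
    """Build an argument list for a combined rapid bootstrap + ML search run."""
    return [
        executable,
        "-f", "a",                   # rapid bootstrapping and best-tree ML search in one run
        "-p", str(parsimony_seed),   # random seed for parsimony starting trees
        "-x", str(bootstrap_seed),   # random seed for rapid bootstrapping
        "-N", str(replicates),       # number of bootstrap replicates
        "-k",                        # keep branch lengths on bootstrap trees
        "-m", model,                 # substitution model
        "-s", alignment,             # input alignment (hypothetical file name)
        "-n", run_name,              # output suffix (hypothetical run name)
    ]


# File and run names below are placeholders, not from the source.
plastome_cmd = raxml_command("plastome_matrix.phy", "plastome",
                             6540444, 18884, 1000, "GTRCAT")
rpoc2_cmd = raxml_command("rpoC2_matrix.phy", "rpoC2",
                          1648144, 2465477, 5000, "GTRGAMMA")

if __name__ == "__main__":
    print(" ".join(plastome_cmd))
    print(" ".join(rpoc2_cmd))
```

The list form can be passed to `subprocess.run` on a machine where RAxML is installed; the sketch itself only builds and prints the commands.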

Chloroplast Transcriptome Sequencing and Analysis 

We mechanically homogenized fresh frozen leaves (~200 mg) from A. syriaca
(W. Phippen 5 [OSC]) on dry ice in a FastPrep-24 bead mill (MP Bio). We added
1.5 ml of cold extraction buffer (3 M LiCl/8 M urea; 1% PVP K-60; 0.1 M
dithiothreitol [Tai et al. 2004]) to the ground tissue, homogenized the tissue,
and pelleted cellular debris at 200 × g for 10 min at 4 °C. The supernatant was
incubated overnight at 4 °C, and RNA was pelleted by centrifugation
(20,000 × g for 30 min at 4 °C) and cleaned using the ZR Plant RNA MiniPrep kit
(Zymo Research). We prepared an RNA-Seq library using 2 µg total RNA, followed
by ribosomal RNA subtraction (RiboMinus™; Life Technologies), and standard
TruSeq reagents (Illumina Inc.) modified to enable strand-specific sequencing
by dUTP incorporation (Parkhomchuk et al. 2009). In this approach, second
strand synthesis is supplemented with a dUTP/dNTP mixture (Fermentas
ThermoScientific) in place of standard dNTPs. Prior to enrichment PCR,
uracil-containing sites are degraded with a uracil-specific excision reagent
mixture (New England Biolabs). We enriched the library using standard TruSeq
amplification primers, and a 12-pM aliquot was sequenced at OSU-CGRB using an
Illumina HiSeq 2000 to obtain 101 bp single-end reads.
We removed reads not passing the Illumina chastity and purity filters and
trimmed all reads to 100 bp. We used Trimmomatic to trim leading and trailing
bases less than Q20, to trim the rest of the sequence when the average quality
in a sliding window of 5 bp fell below Q30, and to exclude reads <36 bp
following trimming, which resulted in a final data set of 24.6 M reads. We
determined expression of the insert region and flanking operons by mapping the
quality-filtered reads back to the plastome reference sequence using BWA. To
estimate sequencing depth, we calculated the average depth in a sliding window
of 20 bp across the atpI-rpoC2 region. We also calculated reads per kilobase of
exon model per million mapped reads (RPKM) values (Mortazavi et al. 2008)
separately for sense and antisense reads mapped to the insert region and
flanking genes using Artemis v. 15.1.1 (Carver et al. 2012) and BamView v.
1.2.9 (Carver et al. 2013).
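
The two expression summaries used above can be written out explicitly. RPKM follows the standard definition of Mortazavi et al. (2008): reads mapped to a feature, divided by the feature length in kilobases and by the total mapped reads in millions. The sliding-window function assumes a step of 1 bp, which the text does not specify; both function names are hypothetical and the authors obtained their values with Artemis and BamView, not with this code.

```python
def rpkm(reads_in_feature, feature_len_bp, total_mapped_reads):
    """Reads per kilobase of feature per million mapped reads."""
    return reads_in_feature * 1e9 / (feature_len_bp * total_mapped_reads)


def sliding_mean_depth(depth, window=20):
    """Mean per-base depth in each full window, advancing 1 bp at a time.

    Assumption: step size of 1 bp; returns [] if depth is shorter than window.
    """
    if len(depth) < window:
        return []
    total = sum(depth[:window])
    means = [total / window]
    for i in range(window, len(depth)):
        # Slide the window: add the new base, drop the one that left.
        total += depth[i] - depth[i - window]
        means.append(total / window)
    return means


if __name__ == "__main__":
    # Made-up numbers for illustration only.
    print(rpkm(500, 1000, 1_000_000))
    print(sliding_mean_depth([1, 2, 3, 4], window=2))
```

Computing RPKM separately for sense and antisense reads, as in the text, simply means calling the function once per strand with strand-specific counts.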

Results 

Sequencing and Assembly of the A. syriaca Plastome and
Mitochondrial Genome Confirm Intracellular HGT 

Sequencing and assembly of a 158,719-bp A. syriaca plastome (GenBank: KF386166)
confirmed a 2,427-bp insertion in the rps2-rpoC2 intergenic spacer (2,677 bp
total length; fig. 1A) relative to other angiosperms in the asterid clade (ca.
80,000 species). This insertion is also present in the plastome of a second
A. syriaca individual (GenBank: JF433943.1) that we previously sequenced
(Straub et al. 2011; Ku et al. 2013), and this region in the newly sequenced
individual differs by the insertion of 21 bp, 16 of which are repeated upstream
of the insertion, and deletion of 21 bp that comprises a direct repeat in the
first individual. The rps2-rpoC2 intergenic spacer ranges from 208 to 327 bp in
28 other photosynthetic asterid species with plastomes in GenBank (table 2).
Blast (Altschul et al. 1997) searches of the GenBank nucleotide database using
the A. syriaca rps2-rpoC2 spacer region as a query returned high confidence
hits to plant mitochondrial genomes, including grape (Vitis vinifera), papaya
(Carica papaya), and watermelon (Citrullus lanatus). The putatively homologous
mitochondrial region includes rpl2, confirming that the A. syriaca plastome
contains a pseudogene (ψrpl2) consisting of the second exon of mitochondrial
rpl2. Assembly of the approximately 690 kb milkweed mitochondrial genome from
the same individual that we used for plastome sequencing revealed that it
contains sequences homologous to the inserted segment in the plastome
(fig. 1A), which are split between two regions of 1,091 and 1,401 bp and
separated by 55,845 bp in the master circle of the contemporary A. syriaca
mitochondrial genome.

To confirm that the appearance of mitochondrial sequence in the plastome was
not the product of misassembly, we utilized the quantitative nature of short
read sequencing and the fact that ptDNA is typically an order of magnitude more
abundant than mtDNA in Asclepias genomic DNA extractions (Straub et al. 2011;
Straub et al. 2012). A total of 72,004 reads mapped to the mtDNA insert in the
plastome, following a correction for reads that could have originated from the
mitochondrial genome due to 100% sequence similarity in blocks that exceed the
read length. This value falls within the 95% confidence interval
(65,947-76,521) of reads mapped to equal-sized partitions of the surrounding
single-copy portion of the plastome and differs markedly from the corrected
value of 3,509 reads mapped to the homologous regions of the mitochondrial
genome. The inserted region therefore has similar sequencing depth to the
surrounding plastome, rather than the mitochondrial genome, which demonstrates
that it is not an assembly artifact (fig. 1B).
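
The depth argument above reduces to two simple checks on the reported read counts, shown here as a worked example. All numbers are taken from the text; the variable names are only for illustration.

```python
insert_reads = 72_004               # corrected reads mapped to the plastome insert
ci_low, ci_high = 65_947, 76_521    # 95% CI from surrounding plastome windows
mito_reads = 3_509                  # corrected reads mapped to the mitochondrial homologs

# Check 1: the insert count lies inside the plastome confidence interval.
within_plastome_ci = ci_low <= insert_reads <= ci_high

# Check 2: the insert is far more deeply sequenced than the mitochondrial copy.
fold_difference = insert_reads / mito_reads

if __name__ == "__main__":
    print(within_plastome_ci)
    print(round(fold_difference, 1))
```

The roughly 20-fold difference is consistent with the expectation stated in the text that ptDNA is about an order of magnitude more abundant than mtDNA in these extractions.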

The plastome insert and the mitochondrial sequence have a pairwise sequence
divergence of 0.072 substitutions per site and differ by 29 indels with a total
length of 251 bp. The pairwise sequence divergence between ψrpl2 and the
mitochondrial rpl2 is 0.089 substitutions per site. Both the plastid and
mitochondrial sequences have the conserved stop codon characteristic of core
eudicots that separates the portion of the gene encoded in the mitochondrial
genome (5' rpl2 sensu Adams et al. 2001) and the portion presumably transferred
to the nuclear genome (Adams et al. 2001). In the translated pseudogene
sequence, five of the substitutions relative to mitochondrial rpl2 are
nonsynonymous, and one is a nonsense mutation that would lead to a truncation
16 amino acids prior to the conserved end of 5' rpl2. In addition, the
homologous regions of the mitochondrial genome are flanked by sequence of
plastid origin, a partial rps2-rpoC2 intergenic spacer on the 5' end and an
rpoC2 pseudogene (ψrpoC2) on the 3' end (fig. 1A), which has numerous missense,
nonsense, and frameshift mutations. We confirmed the presence of the insert in
the plastome by PCR amplification and sequencing of the insert region using
primers anchored in the flanking plastid genes, one of which (rps2) does not
have a pseudogene in the milkweed mitochondrial genome.

Phylogenetic Distribution of the Mitochondrial to Plastid 
Genome HGT 

To place the transfer event into an evolutionary context, we surveyed 22 other
species of Apocynaceae to determine whether they shared the mitochondrial
segment of DNA in the rps2-rpoC2 intergenic spacer (table 1). First, we used
PCR to screen 17 species spanning three of four subfamilies of Apocynaceae,
with sampling concentrated in Asclepiadoideae, the subfamily to which
A. syriaca belongs, to determine whether the spacer region was similar in size
to that observed in A. syriaca, or whether it was more typical of other
asterids (fig. 2). All members of tribe Asclepiadeae (subfamily
Asclepiadoideae) with successful PCR amplification had spacer regions as large
as or larger than A. syriaca. The large spacer region was also present in
Eustegia (table 1), an enigmatic monotypic African genus that, while possessing
synapomorphic features of tribe Asclepiadeae (Bruyns 1999), has been resolved
in phylogenetic analyses of plastid sequence data alternatively as sister to
the rest of the Asclepiadeae (Liede 2001) or as the sister group of the tribes
Marsdenieae and Ceropegieae (Rapini et al. 2003; Meve and Liede 2004; Rapini et
al. 2007). Members of tribes Fockeeae, Marsdenieae, and Ceropegieae of
Asclepiadoideae and subfamilies Secamonoideae, Periplocoideae, and Apocynoideae
did not show evidence of an insertion in the spacer region (table 1).

We selected 11 species for whole plastome sequencing and phylogenetic analysis
to determine whether the large spacers of tribe Asclepiadeae contained
mitochondrial inserts and the order of evolutionary events leading to the mtDNA
transfer to the plastome. As expected, we detected inserts ranging in size from
2,427 to 4,715 bp in the rps2-rpoC2 intergenic spacer in the plastome
assemblies of all species of Asclepiadeae, including Eustegia, but not in other
Apocynaceae (figs. 2 and 3). Comparison of the inserts of other Asclepiadeae
with the A. syriaca mitochondrial genome revealed a segment of approximately
400 bp that is absent from the plastome and mitochondrial genome of A. syriaca
and does not have significant Blast hits in GenBank. The phylogenetic analysis
of the plastome sequences (11,933 variable of 117,206 total characters; -ln
likelihood -257,351.74) considering only substitutions, but not indel events,
including the insertion from the mitochondrial genome, revealed that Eustegia
is sister to the rest of Asclepiadeae (fig. 4) and may be retained in the tribe
as its earliest diverging lineage. Therefore, the plastome insertion provides a
rare genomic change that is a molecular synapomorphy for Asclepiadeae.

Comparison of the 3'-end of aligned plastid rpoC2 se- 
quences showed that multiple distinct mutations have oc- 
curred in Apocynaceae that truncate the protein by 2-18 
amino acid residues (supplementary fig. S1, Supplementary 
Material online). The length of the protein is otherwise con- 
served across most asterids, although Asteraceae and Ericales 
also exhibit some length variation (table 2). 

Phylogenetic Distribution of the Plastid to 
Mitochondrial HGT 

To explore a possible mechanism for the DNA transfer from the mitochondrial
genome to the plastome, we used PCR to screen the same set of species evaluated
by PCR for the presence of the mtDNA insertion in the plastome for a ψrpoC2
gene flanking the mitochondrial rpl2 gene in the mitochondrial genome. Nearly
all surveyed members of Asclepiadoideae showed the presence of the pseudogene
based on amplification using a mitochondrial genome-specific primer anchored in
rpl2 and three different primers designed from the A. syriaca ψrpoC2 sequence
(fig. 2 and table 1). However, samples from subfamilies Secamonoideae and
Periplocoideae did not amplify using these primers, indicating that either the
pseudogene is not present or the PCR priming sites are divergent or absent. A
ψrpoC2 sequence of 499 bp was informatically assembled for S. afzelii,
indicating that all reverse primer sites are in fact absent in this species.

Fig. 4. ML phylogeny of the Asclepiadoideae (Apocynaceae) based on plastome
sequences of major lineages. The numbers shown above or below the branches are
bootstrap support values.

Pairwise divergence between rpoC2 and ψrpoC2 in A. syriaca was 0.011
substitutions per site, which is much less than the pairwise divergence
observed between the plastome insert and the homologous regions of the
mitochondrial genome (fig. 5A), indicating the possibility of more recent
exchange between the two genomes. To test for evidence of gene conversion, we
informatically assembled the ψrpoC2 sequences for species with sequenced
plastomes and obtained Sanger sequences for portions of the pseudogene in
A. syriaca, Eustegia, and Telosma to confirm the informatically assembled
sequences. Low sequencing depth of the mitochondrial genome and/or the apparent
presence of more than one rpoC2 pseudogene prevented the mitochondrial
sequences of Araujia and A. nivea from being successfully assembled.
Phylogenetic analysis of the successfully determined mitochondrial sequences
and the plastid homologs (690 variable of 4,222 total characters; -ln
likelihood -11896.01) indicated that there has indeed been recent recombination
between the plastid gene and the mitochondrial pseudogene in A. syriaca, as the
mitochondrial sequence is strongly supported to share more recent ancestry with
plastid sequences than with mitochondrial sequences from other species
(fig. 5B). Some of the other mitochondrial sequences share single-nucleotide
polymorphisms with the plastid sequence from the same species and may be
mosaics containing smaller tracts of gene conversion (Hao and Palmer 2009; Hao
et al. 2010) than what we observed in A. syriaca and which are too short to
affect the placement of the sequences in the phylogenetic analysis (Hao and
Palmer 2011).
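
The comparisons above rest on two simple alignment statistics: a pairwise divergence and a sliding-window percent identity (fig. 5A uses a 20-bp window). The sketch below shows one way to compute both from a pair of aligned sequences. It uses the uncorrected proportion of differing sites (p-distance) as a stand-in; the source does not state which distance measure underlies the reported substitutions-per-site values, and the function names and gap handling are assumptions.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Uncorrected proportion of differing sites between two aligned sequences.

    Assumption: alignment columns containing a gap in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        differences += a != b
    return differences / compared if compared else 0.0


def window_identity(seq_a, seq_b, window=20):
    """Percent identity in each full window of alignment columns (step 1).

    Assumption: any column where the two sequences differ, including a gap
    opposite a base, counts as a mismatch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = [a == b for a, b in zip(seq_a.upper(), seq_b.upper())]
    return [100.0 * sum(matches[i:i + window]) / window
            for i in range(len(matches) - window + 1)]


if __name__ == "__main__":
    print(p_distance("ACGT-A", "ACGA-A"))
    print(window_identity("AAAA", "AAAT", window=2))
```

Long runs of 100% identity windows flanked by lower-identity windows are the kind of pattern that would be expected from a recent gene conversion tract.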

Due to the presence of the partial rps2-rpoC2 intergenic 
spacer observed in the A. syriaca mitochondrial genome, we 
used this sequence to look for evidence of more than one (i.e., 
a plastid and a putatively mitochondrial) copy of this sequence. 
Asclepias nivea, Araujia, Orthosia, Matelea, and Marsdenia all 
showed evidence of two sequences. The other species did not, 
but due to the short length of this sequence, it is possible that 
it is present in all species but has not accumulated mutations in 
some of them. The observation of two sequences, especially in 
Marsdenia (tribe Marsdenieae), leaves open the possibility that 
this sequence was present in the ancestral mitochondrial 
genome of Asclepiadeae along with ψrpoC2.

Expression Analysis of the Plastome Insert Region 

We evaluated transcriptome sequences derived from leaf tissue of the same
A. syriaca individual used in genomic sequencing to determine whether the ψrpl2
exon was expressed at the level of mRNA. Our results show that the flanking
operons from rpoC2 (including rpoB, rpoC1, and rpoC2) and rps2 (including rps2,
atpA, atpI, and atpH) are actively transcribed in leaf tissue, but that
transcripts representing the insert and the ψrpl2 exon are not abundant
(fig. 6). The RPKM values for sense (and antisense) mapped reads were 3,298
(6), 854 (1), 149 (12), and 254 (12) for atpI, rps2, the insert, and rpoC2,
respectively.

Discussion 

Plant plastomes are highly conserved, and intracellular gene 
transfer involving the plastome has been shown to be highly 
asymmetric. Although genetic material from plastid genomes 
is regularly transferred to plant mitochondrial and nuclear ge- 
nomes, it has long been believed that plant plastomes do not 
tolerate incorporation of foreign DNA (Richardson and Palmer 
2007; Keeling and Palmer 2008; Smith 2011). The power of 
the quantitative nature of next-generation sequencing (NGS) 
allowed confirmation of just such a rare event, transfer of 
mtDNA into the plastome, through sequencing and assembly 
of these two genomes in common milkweed. Further sam- 
pling of other species of Apocynaceae revealed that this trans- 
fer event uniquely occurred in the common ancestor of tribe 
Asclepiadeae. Because of the conserved nature of the plas- 
tome, evidence of this intracellular HGT has been preserved 
through the divergence of thousands of species, originating 
perhaps in the late Eocene (Rapini et al. 2007). Evidence for 
the mechanism of the HGT event, as well as of continued 
interaction of the plastid and mitochondrial genomes, is 
Fig. 5. Gene conversion between the plastid rpoC2 and mitochondrial ψrpoC2 genes of A. syriaca. (A) Plot of percent identity between homologous 
regions of the plastome and mitochondrial genomes of A. syriaca over a 20-bp sliding window. Green indicates 100% identity, brown <100% but at least 
[...] pseudogene sequence in the plastid clade indicates sufficient gene conversion to affect the outcome of the phylogenetic analysis. 



present in the DNA sequences obtained from representative 
species. 

The Mechanism of the Mitochondrial to Plastome HGT in 
Milkweeds 

A suite of DNA repair mechanisms functions to maintain the 
integrity of the plastome and contribute to the conservative 
evolution of this genome, which exists in a genotoxic environ- 
ment due to photo-oxidative stress (Marechal and Brisson 
2010). Double-strand breaks in ptDNA can be repaired by 
homologous recombination through both the classical dou- 
ble-strand break repair model (DSBR) and the synthesis-de- 
pendent strand annealing model (SDSA), either of which 
can result in gene conversion if the template used differs 
from the DNA under repair (Odom et al. 2008; Marechal 
and Brisson 2010). Homologous sequences as short as 
50-150 bp are long enough to allow these mechanisms to 
occur (Singer et al. 1982; Watt et al. 1985; Marechal and 
Brisson 2010). One of these ptDNA repair mechanisms is 
likely responsible for the incorporation of mtDNA into the 
plastid genome in the ancestor of Asclepiadeae following a 
double-strand break in the rps2-rpoC2 intergenic spacer. 

All Asclepiadoideae (the subfamily containing Asclepiadeae) 
share the presence of ψrpoC2 flanking rpl2 in the mitochondrial 
genome, indicating that the ancestral 
mitochondrial genome of Asclepiadeae contained the neces- 
sary homologous sequence on one flank of the mtDNA that 
was inserted into the plastome. Presence of plastid rps2- 
rpoC2 flanking the other end of the inserted segment (as ob- 
served in the mitochondrial genome of A. syriaca and likely 
present in other Asclepiadoideae) in combination with 
ψrpoC2 could have facilitated repair of a double-strand 
break between rps2 and rpoC2 through either DSBR or 
SDSA that used the mtDNA as a template, rather than another 
copy of the plastome, of which there are multiple copies per 
cell in most higher plants (Smith 2011). There is also a plastid 
ψrpoB adjacent to the spacer sequence in the mitochondrial 
genome of A. syriaca (fig. ^A), but there is no evidence that it 
played a role in the HGT mechanism. It could represent an 
independent transfer of ptDNA to the mitochondrial genome 
or, if it was part of a single transfer that included the rps2- 
rpoC2 spacer sequence, the intervening rpoC2 and rpoC1 
genes and 3' rpoB sequence (ca. 9,100 bp) have since been 
deleted. 

The two segments of the mitochondrial genome that were 
inserted into the plastome are not currently contiguous in 
A. syriaca. The arrangement observed in the plastome insert 
likely represents the ancestral mitochondrial genome 
sequence that has since been rearranged. This would be con- 
sistent with observations showing slow nucleotide substitu- 
tions in plant mitochondrial genomes but a rapid pace of 
rearrangement (Wolfe et al. 1987; Palmer and Herbon 
1988; Marechal and Brisson 2010; Knoop 2012). The pres- 
ence of sequence in the plastome inserts without homology to 
the Asclepias mitochondrial genome or sequence in GenBank 
could be due to the deletion of this sequence in the Asclepias 
mitochondrial genome, though it was present in the ancestral 
mitochondrial genome, but also leaves open the possibility 
that the mechanism of transfer may be more complicated 
and has involved multiple steps or sources of DNA, such as 
the nuclear genome. 

Due to the presence of homologous flanking sequence in 
the mitochondrial genomes of milkweeds, homologous 
recombination double-strand break repair mechanisms 
(DSBR and SDSA) seem most likely as causal processes in the 
HGT. However, other mechanisms are known to operate in 
plastome repair. Single-strand annealing and microhomology- 
mediated end joining also occur; however, these alternative 
pathways generally cause deletions, not insertions (Kwon 
et al. 2010; Marechal and Brisson 2010). A mechanism 
thought to be similar to nonhomologous end joining, which 
frequently occurs during repair of nuclear DNA in plants, 
cannot be ruled out but has only rarely been observed in plas- 
tids, and due to its rarity has been cited as a reason that HGT 
had not been observed in plastomes (Odom et al. 2008; Kwon 
et al. 2010). A final possibility is that a transposable element 
could have facilitated the horizontal transfer event, which is 
thought to be the mechanism in the carrot mtDNA horizontal 
transfer to the plastome (Iorizzo et al. 2012a). Non-LTR 
retrotransposons characteristically produce short direct repeats due 
to target site duplication (Han 2010). Direct repeats are ob- 
served at either end of the plastome insertions in Eustegia and 
Astephanus, which belong to the earliest diverging lineages of 
the Asclepiadeae; however, this mechanism is unlikely given 
the absence of sequence in the insert region homologous to 
known transposable elements. 

Regardless of how the mtDNA was incorporated into the 
plastome, it is still unknown how that DNA could have entered 
the plastid. There is no known native DNA uptake mechanism 
in plastids, and in the absence of such a mechanism, the 
double membrane of plastids and their lack of ability to 
fuse, as mitochondria do, may provide sufficient barriers to 
exogenous DNA (Richardson and Palmer 2007; Keeling and 
Palmer 2008; Bock 2010; Kwon et al. 2010; Smith 2011) in all 
but the most exceptional cases. The uptake of foreign DNA by 
plastids can be induced through stress treatments (Cerutti and 
Jagendorf 1995) and transformation accomplished using var- 
ious methods (Maliga 2004), which further demonstrates that 
once the double membrane is breached, incorporation of for- 
eign DNA is not a difficult step and can readily occur by ho- 
mologous recombination (Marechal and Brisson 2010). 

Maintenance of the Insert over Evolutionary Time 

In higher plants, horizontally transferred DNA is generally not 
functional in the recipient genome, even if the segment con- 
tains a gene or genes, and horizontally transferred DNA is 
usually only transiently maintained in the recipient genome 
on an evolutionary time scale (Richardson and Palmer 2007; 
Keeling and Palmer 2008; Bock 2010). Even considering the 
slower rate of evolution of the plastome than the nuclear 
genome, this large insertion has been maintained, relatively 
intact, since approximately the late Eocene in all subtribes of 
tribe Asclepiadeae. Sequence comparison across these sub- 
tribes indicates that the insert is beginning to degrade, and 
this is most notable in A. syriaca, in which approximately half 
of the insert sequence has been deleted. The general mainte- 
nance of this foreign insertion raises the question of whether 
or not it confers a selective advantage that caused that plas- 
tome variant to drive to fixation in the ancestor of 
Asclepiadeae, or whether it defied long odds as a neutral 
variant and was fixed by genetic drift. 

In its current form, fitness advantages conferred by the 
horizontally transferred DNA cannot be realized at the protein 
level because there is minimal transcriptional activity in 
A. syriaca. The lack of abundance of primary or processed 
transcripts for the ψrpl2 exon indicates that the exon is likely 
not functional as a translated protein. This is not surprising 
given that rpl2 exon 2 is likely not essential for function even in 
the mitochondrial genome (Colas des Francs-Small et al. 
2012). The insertion may still possess a less obvious role in 
post-transcriptional processing, as the mature RNAs from 
both flanking operons show large differences in final 
accumulated mRNA levels. Both the 5' untranslated region of the 
rps2-atpA operon and the 3' untranslated region of the 
rpoB-rpoC2 operon appear to extend approximately 100 bp 
into the insert (fig. 6), creating a more robust target for 
transcriptional modification, making it possible that the insert 
behaves as a novel transcriptional enhancer by stimulating 
translational efficiency, either due to direct interactions with 
70S ribosomes (Pfalz et al. 2009) or by stabilizing specific iso- 
forms that are more readily translatable (Felder et al. 2001). 
These extensions may also serve as targets for PPR-class genes 
or sRNAs, both of which target the 5' and 3' termini of transcripts 
and stabilize RNAs by blocking degradation by chloroplast 
exonucleases (Felder et al. 2001; Pfalz et al. 2009; Barkan 
et al. 2012). It could be telling that Eustegia, which has had 
multiple large deletions in the insert, retains the short regions 
of the insert sequence where transcription is extended in 
A. syriaca (fig. 3). However, if the insert segment is indeed 
neutral, its success is likely a consequence of "lucky insertion" 
into an intergenic spacer between two transcriptional operons 
(rpoB-rpoC2 and rps2-atpA), rather than into a location that 
would have caused disruption of an essential gene in the 
gene-dense plastome. In this case, the observed minimal tran- 
scription is likely a function of the frequency of promoters 
throughout the plastome and inefficient, stochastic transcription 
termination in plastids, which contribute to transcription 
of virtually the entire plastome (Hotto et al. 2012; Zhelyazkova 
et al. 2012). 

Continued Exchange with the Mitochondrial Genome 
through Gene Conversion 

The simultaneous phylogenetic analysis of mitochondrial 
ψrpoC2 and plastid rpoC2 sequences revealed relatively 
recent gene conversion, such that mitochondrial ψrpoC2 in 
Asclepias is more closely related to plastid rpoC2 than to 
ψrpoC2 in other Asclepiadeae. The position of the S. afzelii 
(subfamily Secamonoideae) mitochondrial sequence sister to 
all other Asclepiadoideae mitochondrial and plastid sequences 
could be because the ψrpoC2 in this species is of independent 
origin. Another possibility is that the pseudogenes are of 
common evolutionary origin, but there was plastid to mito- 
chondrial gene conversion in the common ancestor of 
Asclepiadoideae after the divergence of Secamonoideae, 
which at that time caused the mitochondrial pseudogene to 
become more closely related to the plastid gene than to the 
mitochondrial pseudogene in the sister lineage of 
Asclepiadoideae (i.e., Secamonoideae). Taken together, 
these results illustrate that the relationship between the two 
genomes is dynamic over evolutionary time. 

This phenomenon has been observed before. A comparison 
of the maize and rice mitochondrial and plastid genomes 
revealed that there has been ongoing intergenomic 
exchange through gene conversion (Clifton et al. 2004) since 
an ancient transfer of ptDNA to the mitochondrial genome 
that likely occurred before the divergence of gymnosperms 
and angiosperms (Wang et al. 2007). Gene conversion 
among native mitochondrial genes, those of plastid origin, 
and with exogenous DNA has been observed on a finer 
scale leading to mosaicism in mitochondrial genes, which 
can be a significant generator of diversity in plant mitochon- 
drial genomes (Hao and Palmer 2009; Hao et al. 2010) and 
may be a cause of phylogenetic incongruence (Hao and 
Palmer 2011). It is possible that such interplay between ge- 
nomes has affected the evolution of the plastome as well and 
could be a contributor to the variability we observed at the 3' 
end of plastid rpoC2 in Apocynaceae. 

Conclusion 

The evidence provided in this study for HGT between the 
mitochondrial and plastid genomes of milkweeds, combined 
with the results of Iorizzo et al. (2012b) showing a 
transposable element-mediated HGT of mtDNA into the plastome of 
carrot, confirms that DNA transfer from land plant mitochon- 
drial genomes to plastomes is possible and can occur by more 
than one mechanism. The sequencing of additional land plant 
plastomes may reveal that horizontal or intracellular gene 
transfer into these genomes is not as rare as was previously 
thought and give further insight into these and other 
mechanisms of DNA transfer between plant mitochondrial and 
plastid genomes. 

Supplementary Material 

Supplementary figure S1, table S1, data sets S1 and S2 are 
available at Genome Biology and Evolution online 
(http://www.gbe.oxfordjournals.org/). 